This one was simple, an obvious place to start. You ask an AI for the first line of Canto I and you get

> And then went down to the ship,
which is, actually, quite good. The first line of the first Urcanto is either that, or
Ask again, and you get

> And then went down to the ship
which it is, ‘cept the comma.
Ask for the second line and you might get it.
Ask for the first line of the second Canto and you’ll be off, with no frame of reference and nothing but your dusty paper copy to check against.
Within AI, many people call this ‘new’ answer a hallucination. They mean, given a lack of knowledge, something has been made up. To combat hallucination, you’ll often see some sort of append:
giving,
Well that doesn’t really work in ChatGPT (the model in the examples above). But how does ChatGPT know what the first line is? The technology is amazing. There is no ‘lookup’, no connection to the internet. Given that prompt, ‘What is the first line…’, some sort of shift occurs in the nodes, changing the path of least resistance such that the first most likely token (~ ‘word’) is `And`, the second is `then`, the third is `went`, and so on. There’s not one grab: And then went down to the ship,
There could always be an alternative, but the probability is too low.
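If you want the mechanics in miniature: generation is just repeated next-token selection over a probability distribution. A toy sketch (the numbers are invented, and real models work over sub-word tokens, but the shape is right):

```python
# Toy next-token selection; probabilities invented for illustration.
# After the prompt, the model scores every token in its vocabulary;
# greedy decoding simply takes the top one, then repeats.
probs = {
    "And": 0.62,   # the path of least resistance
    "The": 0.21,
    "Hang": 0.09,  # alternatives always exist...
    "I": 0.05,     # ...their probability is just too low
}
first_token = max(probs, key=probs.get)
print(first_token)  # -> And
```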
How? Because 1) that’s what a neural network is, and 2) enough text referring to Ezra Pound, his Cantos, Canto I, and line 1 has been passed to the model in training.
All the promise of AI, and I can’t get a reliable

> And then went down to the ship,
Surely this must be fixed. But how? Could we train an AI on the entirety of the Cantos? That would be EXPENSIVE, and the product would still feel unreliable. What next? We could employ RAG — Retrieval Augmented Generation, i.e. bucking up a response using documents, like `Canto I.txt`. And, I’ll admit, this is certainly a place we’ll want to go, especially with the work going on in Library. But I approached this one differently.
Not only is RAG quite a step up (vector stores for documents, …), but The Cantos are the fundamental document of this system. Getting line 72 when you asked for line 73 is devastating, and I don’t know how well a vector lookup would handle such a thing. I did do research into it, and these vectorised lookups are really good for semantics, not indices: asking for paragraphs on ‘the sea’ produces a vector, which is then compared to the whole document, itself made into vectors, and voilà. This doesn’t handle line numbers.
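A minimal sketch of the problem, using sentence-transformers (the model name is just a common default, not what any system here uses): two queries that differ only in their line number land almost on top of each other in embedding space, so a vector store has no firm basis to tell them apart.

```python
# Why semantic retrieval struggles with indices: near-identical
# queries produce near-identical vectors.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
queries = model.encode(["line 72 of Canto VI", "line 73 of Canto VI"])

# Cosine similarity will be very high; a vector store is liable
# to hand back the same passage for both questions.
print(util.cos_sim(queries[0], queries[1]))
```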
So the solution was to build just one module of a larger system. It does this: takes a piece of text, “What is the first line of Canto I?”, and extracts references:
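The module’s output isn’t reproduced here, but going by the webapp described below, the shape is something like this (the field names are my guess, not the module’s actual schema):

```json
{
  "work": "The Cantos",
  "canto": "I",
  "lines": { "from": 1, "to": 1 }
}
```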
And that is hardy. I extended my own store of digitised Cantos with a webapp that takes that piece of JSON and returns the lines.
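The webapp itself isn’t shown here either, but the lookup side of it can be sketched in a few lines, assuming one plain-text file per canto (like the `Canto I.txt` mentioned above) with one verse line per file line:

```python
# A minimal sketch of the lookup side (my own reconstruction; the
# real webapp may differ). One file per canto, one verse line per
# file line, line numbers 1-indexed.
from pathlib import Path

def get_lines(canto: str, start: int, end: int) -> list[str]:
    text = Path(f"Canto {canto}.txt").read_text(encoding="utf-8")
    lines = text.splitlines()
    return lines[start - 1 : end]

print(get_lines("I", 1, 1))  # -> ['And then went down to the ship,']
```

The point is that this is an index lookup, not a similarity search: ask for line 73 and you get line 73.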
So then what? Well, that’s not the full solution. All we have done is resolve references. But what if the initial question was not asking for a line, but for something else?
Our reference extraction would return the whole of Canto VI. AND THEN what we do is add that to the CONTEXT. Going back to the first example for simplicity:
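The assembled prompt isn’t reproduced here, but it would look roughly like this (the wording of the wrapper is illustrative only):

```python
# Illustrative only: prepend the retrieved lines as context,
# then restate the original question.
retrieved = get_lines("I", 1, 1)  # from the lookup sketch above

prompt = (
    "Context (The Cantos, Canto I, line 1):\n"
    + "\n".join(retrieved)
    + "\n\nQuestion: What is the first line of Canto I?"
)
```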
Now the AI has everything it needs to answer the question reliably.
So, how did it go, and what’s it really good for?
This was my first experience getting stuck into LLMs (my first experiment used BERT, a more singular model). Using only a little bit of fine-tuning, I got my base model (Zephyr 7B β) performing as desired; a huge success, something that bodes well for the future, and an essential tool in the belt. One shame was that I didn’t manage (yet) to do <some quite technical things> that would let me run the fine-tuned model on my own computer. Part of me feels hindered by the current technology, which is great; it means we’re at the front of it and when things become more publicly possible, we’ll be there. What this means is that using this model currently incurs a financial cost, because we have to rent the computer it runs on. There was some expense in fine-tuning the model, around $20 total, largely spent on the learning curve. I expect we’ll see technological improvements in the first half of this year that will make running this model nigh cost free.
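For the curious, the broad shape of such a fine-tune with today’s open tooling looks roughly like this; a generic sketch (hypothetical dataset file, off-the-shelf defaults), not the actual recipe or data used in this experiment:

```python
# Generic LoRA fine-tune of Zephyr 7B β with the HuggingFace stack.
# "references.jsonl" is a hypothetical dataset of prompt/extraction
# examples, each record carrying a "text" field.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("json", data_files="references.jsonl", split="train")

trainer = SFTTrainer(
    model="HuggingFaceH4/zephyr-7b-beta",
    train_dataset=dataset,
    peft_config=LoraConfig(r=16, lora_alpha=32, target_modules="all-linear"),
    args=SFTConfig(output_dir="zephyr-refs", num_train_epochs=1),
)
trainer.train()
```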
And what is it good for? As detailed in /about, goal 2, really. Goal 1 (investigating form) is so hands-on that we’re probably working from the Cantos, starting with their lines, not having to look their lines up. But in my current work on goal 1, a deep dive into Canto VI, I have wanted this functionality at hand already, and with this experiment producing a little module that can be plugged into larger systems, it is already proving itself useful (if it cd/ be run cheaply).
My work on Canto VI began with an investigation of the ring, but it is not so much the goal of the investigation as the tools available that constitute the current digital utility, i.e. the tech is still being dragged along by the mind, no inversion yet. The imagined product really, at the moment, is artwork.
All tech details of this experiment are available at https://github.com/POUNDIAN/down-to-the-ship.